Mixup is a popular data augmentation technique for training deep neural networks in which additional samples are generated by linearly interpolating pairs of inputs and their labels. This technique is known to improve generalization performance in many learning paradigms and applications. In this work, we first analyze Mixup and show that it implicitly regularizes infinitely many directional derivatives of all orders. We then propose a new method that improves Mixup based on this insight. To demonstrate the effectiveness of the proposed method, we conduct experiments across various domains such as images, tabular data, speech, and graphs. Our results show that the proposed method improves on Mixup across various datasets and architectures, for instance yielding a 0.8% improvement over Mixup in ImageNet top-1 accuracy.
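As a rough illustration of the interpolation step described in this abstract, the sketch below builds one Mixup sample from two (input, one-hot label) pairs. It shows plain Mixup only, not the improved method the authors propose. The function name `mixup_pair`, the `alpha` parameter, and drawing the mixing weight from a Beta(alpha, alpha) distribution are assumptions taken from the common formulation of Mixup; the abstract itself does not specify them.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Return one Mixup sample from two (input, one-hot label) pairs.

    alpha and the Beta(alpha, alpha) draw are assumptions, not taken
    from the abstract.
    """
    # Mixing weight in [0, 1]; small alpha pushes it towards 0 or 1.
    lam = rng.betavariate(alpha, alpha)
    # Interpolate the inputs and the labels with the same weight.
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam
```

Because the same weight is applied to inputs and labels, the mixed label stays a valid probability vector whenever both original labels are.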
translated by 谷歌翻译
Bayesian inference offers principled tools to tackle many critical problems with modern neural networks, such as poor calibration and generalization, and data inefficiency. However, scaling Bayesian inference to large architectures is challenging and requires restrictive approximations. Monte Carlo Dropout has been widely used as a relatively cheap way to perform approximate inference and to estimate uncertainty with deep neural networks. Traditionally, the dropout mask is sampled independently from a fixed distribution. Recent works show that the dropout mask can be viewed as a latent variable, which can be inferred with variational inference. These methods face two important challenges: (a) the posterior distribution over masks can be highly multi-modal, which can be difficult to approximate with standard variational inference, and (b) it is not trivial to fully utilize sample-dependent information and correlation among dropout masks to improve posterior estimation. In this work, we propose GFlowOut to address these issues. GFlowOut leverages the recently proposed probabilistic framework of Generative Flow Networks (GFlowNets) to learn the posterior distribution over dropout masks. We empirically demonstrate that GFlowOut results in predictive distributions that generalize better to out-of-distribution data and provides uncertainty estimates that lead to better performance in downstream tasks.
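For context, the traditional Monte Carlo Dropout baseline mentioned in this abstract (masks sampled independently from a fixed distribution) might be sketched as follows. This is not GFlowOut. The toy linear model, the function name `mc_dropout_predict`, and the parameters `keep_prob` and `n_samples` are all hypothetical and chosen only for illustration.

```python
import random
import statistics


def mc_dropout_predict(x, weights, keep_prob=0.8, n_samples=200, rng=None):
    """Predictive mean and spread of a toy linear model under random masks."""
    if rng is None:
        rng = random.Random(0)
    preds = []
    for _ in range(n_samples):
        # Independent Bernoulli mask per input unit, rescaled by 1/keep_prob
        # so that the masked output matches the unmasked one in expectation.
        mask = [1.0 / keep_prob if rng.random() < keep_prob else 0.0 for _ in x]
        preds.append(sum(w * m * xi for w, m, xi in zip(weights, mask, x)))
    # The spread across stochastic passes serves as an uncertainty estimate.
    return statistics.mean(preds), statistics.pstdev(preds)
```

Each forward pass uses a fresh mask, so the standard deviation across passes reflects how much the prediction depends on which units are dropped.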
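Given graph edges from a dynamic graph, how can we assign anomaly scores to edges and subgraphs in an online manner, so as to detect anomalous behavior using constant time and memory? For example, in intrusion detection, existing work seeks to detect either anomalous edges or anomalous subgraphs, but not both. In this paper, we first extend the Count-Min sketch data structure to a higher-order sketch. This higher-order sketch has the useful property of preserving dense subgraph structure (dense subgraphs in the input turn into dense submatrices in the data structure). We then propose 4 online algorithms that utilize this enhanced data structure and that (a) detect both edge and graph anomalies; (b) process each edge in constant memory and constant update time per newly arriving edge; and (c) outperform state-of-the-art baselines on 4 real-world datasets. Our method is the first streaming approach that incorporates dense subgraph search to detect graph anomalies in constant memory and time.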
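The standard Count-Min sketch that this abstract starts from can be sketched as below, here counting edges given as (source, destination) tuples. This is the ordinary data structure, not the higher-order extension the authors propose. The class name, the default `depth` and `width`, and the use of SHA-256 to derive one hash function per row are illustrative assumptions.

```python
import hashlib


class CountMinSketch:
    """Fixed-size approximate counter; estimates never undercount."""

    def __init__(self, depth=4, width=1024):
        self.depth = depth
        self.width = width
        self.table = [[0] * width for _ in range(depth)]

    def _bucket(self, item, row):
        # Hypothetical hashing scheme: prefix the row index so that each
        # row behaves like a different hash function.
        digest = hashlib.sha256(f"{row}:{item!r}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "little") % self.width

    def add(self, item, count=1):
        # Constant-time update: one counter per row.
        for row in range(self.depth):
            self.table[row][self._bucket(item, row)] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the best bound.
        return min(self.table[row][self._bucket(item, row)]
                   for row in range(self.depth))
```

Memory is fixed at `depth * width` counters no matter how many edges arrive, which is what makes the structure suitable for streaming use.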
In data containing heterogeneous subpopulations, classification performance benefits from incorporating knowledge of the cluster structure in the classifier. Previous methods for such combined clustering and classification either 1) are classifier-specific and not generic, or 2) perform clustering and classifier training independently, which may not form clusters that benefit classifier performance. The question of how to perform clustering to improve the performance of classifiers trained on the clusters has received scant attention in previous literature, despite its importance in several real-world applications. In this paper, we first theoretically analyze the generalization performance of classifiers trained on clustered data and find conditions under which clustering can potentially aid classification. This motivates the design of a simple k-means-based classification algorithm called Clustering Aware Classification (CAC) and its neural variant DeepCAC. DeepCAC effectively leverages deep representation learning to learn latent embeddings and finds clusters in a manner that makes the clustered data suitable for training classifiers for each underlying subpopulation. Our experiments on synthetic and real benchmark datasets demonstrate the efficacy of DeepCAC over previous methods for combined clustering and classification.
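To make the setting concrete, the sketch below shows the simple "cluster first, then train one classifier per cluster" pipeline that this abstract lists as prior approach 2). It is not CAC or DeepCAC, which couple the two steps. All function names are hypothetical, the per-cluster classifier is a nearest-class-mean rule chosen for brevity, and the initial centers are passed in by the caller.

```python
def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centers[k])))


def kmeans(points, centers, n_iter=10):
    """Plain Lloyd iterations starting from the given centers."""
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Move each center to the mean of its group; keep empty ones in place.
        centers = [[sum(col) / len(g) for col in zip(*g)] if g else c
                   for g, c in zip(groups, centers)]
    return centers


def fit_per_cluster(points, labels, centers):
    """For every cluster, store the mean point of each class seen in it."""
    grouped = [{} for _ in centers]
    for p, y in zip(points, labels):
        grouped[nearest(p, centers)].setdefault(y, []).append(p)
    return [{y: [sum(col) / len(ps) for col in zip(*ps)] for y, ps in g.items()}
            for g in grouped]


def predict(point, centers, models):
    """Route the point to its cluster, then use that cluster's classifier."""
    model = models[nearest(point, centers)]
    classes = list(model)
    return classes[nearest(point, [model[c] for c in classes])]
```

On data where the labeling rule differs between subpopulations, routing each point to its own cluster's classifier can succeed where a single global rule of the same kind cannot.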
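In many machine learning applications, it is important for a model to provide confidence scores that accurately capture its predictive uncertainty. Although modern learning methods have achieved great success in predictive accuracy, producing calibrated confidence scores remains a major challenge. Mixup, a popular yet simple data augmentation technique based on taking convex combinations of pairs of training examples, has been empirically found to significantly improve confidence calibration across diverse applications. However, when and how Mixup helps calibration is still a mystery. In this paper, we theoretically prove that Mixup improves calibration in high-dimensional settings by investigating natural statistical models. Interestingly, the calibration benefit of Mixup increases as the model capacity increases. We support our theory with experiments on common architectures and datasets. In addition, we study how Mixup improves calibration in semi-supervised learning. While incorporating unlabeled data can sometimes make the model less calibrated, adding Mixup training mitigates this issue and provably improves calibration. Our analysis provides new insights and a framework for understanding Mixup and calibration.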
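One common way to quantify how well confidence scores are calibrated is the expected calibration error (ECE), sketched below. The abstract does not say which calibration measure the authors use, so treating ECE as the metric is an assumption, as are the function name and the choice of equal-width bins.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between confidence and accuracy per bin.

    confidences: predicted probabilities of the chosen class, in [0, 1].
    correct: 1 if the corresponding prediction was right, else 0.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # Equal-width bins; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if members:
            avg_conf = sum(c for c, _ in members) / len(members)
            accuracy = sum(h for _, h in members) / len(members)
            ece += len(members) / total * abs(avg_conf - accuracy)
    return ece
```

A perfectly calibrated model scores 0, while a model that is always fully confident but right only half the time scores 0.5.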